Discriminative Word Alignment via Alignment Matrix Modeling

Authors

  • Jan Niehues
  • Stephan Vogel
Abstract

This paper presents a new discriminative word alignment method. The approach models the alignment matrix directly with a conditional random field (CRF), so no restrictions need to be placed on the alignments. Furthermore, features are easy to add, so all available information can be used. Because the structure of the CRF can become complex, inference can only be performed approximately, and the standard algorithms had to be adapted. In addition, different methods for training the model have been developed. With this approach, alignment quality improved by up to 23 percent on 3 different language pairs compared to a combination of both IBM4 alignments. The word alignment was also used to generate new phrase tables, which improved translation quality significantly.
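To make the idea of modeling the alignment matrix concrete, here is a minimal sketch in Python. It is not the authors' model. It assumes one binary variable per source/target word pair and keeps only local, per-cell factors, so each cell can be decided exactly and independently. The abstract describes a richer CRF whose structure requires approximate inference. The feature names, the toy lexicon, and the weights are all hypothetical.

```python
import math

def node_features(src, tgt, i, j, lexicon):
    """Hypothetical local features for the link between src[i] and tgt[j]."""
    # Difference of relative positions: small when the words sit at similar places.
    rel_pos = abs(i / len(src) - j / len(tgt))
    return {
        "bias": 1.0,
        "lexical": lexicon.get((src[i], tgt[j]), 0.0),
        "distortion": -rel_pos,
    }

def link_probability(features, weights):
    """Log-linear score of a single cell, squashed to a probability."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def align(src, tgt, lexicon, weights, threshold=0.5):
    """Return the alignment matrix as rows of 0/1, one cell per word pair.

    Every cell is decided on its own, so a word may link to zero, one,
    or several words on the other side.
    """
    matrix = []
    for i in range(len(src)):
        row = []
        for j in range(len(tgt)):
            p = link_probability(node_features(src, tgt, i, j, lexicon), weights)
            row.append(1 if p > threshold else 0)
        matrix.append(row)
    return matrix

# Toy data, assumed for illustration only.
src = ["das", "haus"]
tgt = ["the", "house"]
lexicon = {("das", "the"): 0.9, ("haus", "house"): 0.8}
weights = {"bias": -2.0, "lexical": 5.0, "distortion": 1.0}

matrix = align(src, tgt, lexicon, weights)
print(matrix)  # [[1, 0], [0, 1]]
```

Adding a feature only means adding a key to the feature dictionary and a matching weight, which reflects the abstract's point that features are easy to add. Factors that couple neighboring cells would break the per-cell independence used here and call for approximate inference instead.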

Similar resources

Discriminative Modeling of Extraction Sets for Machine Translation

We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate featur...

Search for Discriminative Word Alignment via Dual Decomposition

Shiqi Shen, Yang Liu and Maosong Sun (Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China). Abstract: Word alignment aims to compute the correspondence between the words in parallel texts. It has an important influence on machine translation, bilingual dictionary construction and many other natu...

Unsupervised Word Alignment with Arbitrary Features

We introduce a discriminatively trained, globally normalized, log-linear variant of the lexical translation models proposed by Brown et al. (1993). In our model, arbitrary, nonindependent features may be freely incorporated, thereby overcoming the inherent limitation of generative models, which require that features be sensitive to the conditional independencies of the generative process. Howev...

Discriminative Word Alignment with Syntactic Features

This report introduces a study on syntactic features used in a discriminative word alignment model. The features are implemented on a state-of-the-art discriminative word alignment system. The syntactic features are extracted from parse trees. Three types of syntactic features are tested in this work: one global tree path feature and two first-order tree features. Experimental results sho...

Incorporating Constituent Structure Constraint into Discriminative Word Alignment

We introduce an approach to incorporate the constituent structure constraint into a discriminative word alignment model by representing the constituent constraint explicitly and using three operations to enforce the constraint when searching for the best word alignment. In this way, we will be able to make use of the weak order constraint induced by the inversion transduction grammars (ITG), as w...

Publication date: 2008